ABSTRACT

The question of whether test factor structure is indicative of the test
item hierarchy was examined. Data from 1,000 subjects on two sets of five
bivalued Law School Admission Test items, which were analyzed with latent
trait methods of Bock and Lieberman and of Christoffersson in
Psychometrika, were analyzed with an ordering-theoretic method to locate
item hierarchies. Though one item set was unifactor and the other
bifactor, both item sets showed no hierarchies even after 10 percent of
the subjects responsible for low response pattern frequencies were
deleted. Factor structure of a test appears to reveal nothing about the
test's hierarchical structure. (Author)


A central issue in psychometrics is the analysis of the structure among
the dichotomous items in a test. Test structure is conceived in two basic
forms: 1) test factor structure, which is the system of latent traits
(factors or dimensions) which the items measure, and 2) test hierarchical
structure, which is the network of prerequisite relations among the
items. To study these types of test structure, two approaches to test
data analysis have developed: 1) latent trait statistical theory and
2) ordering theory. The first approach provides information as to how
many latent traits are needed to describe the correlational relationships
among test items, as well as information on item statistics and ability
estimates. The second approach provides information as to the
hierarchical structure that best describes the system of logical
relationships among test items.

Latent trait statistical test theory has its roots in the model for
dichotomous item responses developed by Lawley (1943). This model, in
turn, formed the basis for the "normal ogive" model, which has been
further developed by Lord (1952, 1953), Lord & Novick (1968), and
Samejima (1969). Bock and Lieberman (1970) and Christoffersson (1975)
have provided further refinements in the normal ogive model for the
analysis of dichotomously scored items from a consideration of test
factor structure.

Ordering theory, on the other hand, is a measurement model which has as
its primary intent either the testing of hypothesized hierarchies among
bivalent items or the determination of hierarchies among bivalued items.
Methods of ordering theory have been described in a book of published
reports in ordering theory (Krus, Bart, and Airasian, 1975). Applications
of ordering theory have been made in the investigation of Piagetian task
hierarchies (Bart and Airasian, 1974), instructional sequences (Airasian
and Bart, 1975), attitudes toward school (Airasian, Madaus, and Woods,
1975), and other educational psychological topics.


Although any test of bivalent items has a factor structure and a
hierarchical structure, the relationship between the two types of test
structure has not yet been examined. Are the two types of structures
redundant? Is there a correspondence between hierarchical structures and
factor structures such that items fitting an n-factor structure would
have a hierarchy with n branches and vice versa? Or are the two types of
structures unrelated by such a correspondence and thus psychometrically
distinct?


In behavioral sciences such as instructional psychology and the
psychology of intelligence, there have developed models which posit
psychological variables as continuous dimensional traits (e.g., Guilford,
1956) and models which posit psychological variables as sets of behaviors
and skills which have hierarchical structures (e.g., Piaget, 1950; Gagné,
1968). Along with those two types of models have developed continuous
measurement models such as factor analysis and discrete measurement
models such as ordering theory, which have contributed to the empirical
examination of various examples of the two types of psychological
theories. To study the relationship between hierarchical structures and
factor structures is to study the relationship between discrete and
continuous measurement models and thus to examine the relationship
between discrete and continuous psychological theories. Herein lies the
relevance of this study. Are the two types of behavioral science theories
and their corresponding measurement models closely related, or is there a
conceptual chasm between them? As a step in the answer to that question,
the following question will be investigated in this study: can the factor
structure of a test tell us something about the hierarchical structure?


METHOD
Four methods of bivalent item analysis were compared on a single set of
test data. Two methods come from latent trait statistical theory: 1) the
Bock-Lieberman (1970) unconditional maximum likelihood method of
estimating parameters of the normal ogive model for dichotomously-scored
item response patterns and 2) the Christoffersson (1975) generalized
least squares method for multiple factor analysis of dichotomized
variables. Two methods come from ordering theory: 1) the Krus-Bart (1974)
ordering-theoretic method of multidimensional scaling of bivalued items
and 2) an ordering-theoretic method, derived from the methods of Airasian
and Bart (1973) and of Bart and Krus (1973), to identify the hierarchy of
test items that best fits bivalued test item data so that item
intransitivities cannot occur.
In the normal ogive model used by Bock and Lieberman (1970), the
probability of success on an item is viewed as a normal ogive function of
the difficulty and discriminating power of the item and the latent
ability of the subject. Their method was an "unconditional" maximum
likelihood one, because the data were regarded as coming from a sample of
subjects from a specified population and the item parameters were
estimated from integration of the probability function over the
distribution of latent ability. Their method also provided a goodness of
fit test for the normal ogive framework.
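
As a minimal sketch of the model just described, the code below evaluates
a normal ogive item response function and obtains the unconditional
(marginal) probability of a response pattern by integrating over a
standard normal ability distribution. The parameterization with a
discrimination a and a difficulty b, the function names, the parameter
values, and the trapezoid-rule integration are all illustrative
assumptions; this is not Bock and Lieberman's estimation procedure.

```python
import math
from itertools import product

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_correct(theta, a, b):
    """Normal ogive: probability of a correct response at ability theta."""
    return normal_cdf(a * (theta - b))

def pattern_probability(pattern, items, lo=-6.0, hi=6.0, steps=600):
    """Marginal probability of a 0/1 pattern, with theta ~ N(0, 1)."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        theta = lo + k * h
        density = math.exp(-0.5 * theta * theta) / math.sqrt(2.0 * math.pi)
        likelihood = 1.0
        for x, (a, b) in zip(pattern, items):
            p = p_correct(theta, a, b)
            likelihood *= p if x == 1 else 1.0 - p
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end points
        total += weight * likelihood * density * h
    return total

# Hypothetical (a, b) values for five items; not estimates from the LSAT data.
ITEMS = [(0.8, -1.5), (0.7, -0.5), (0.9, 0.0), (0.6, -1.0), (0.5, -2.0)]
probs = {pat: pattern_probability(pat, ITEMS)
         for pat in product((0, 1), repeat=5)}
```

Multiplying each of the 32 pattern probabilities by the sample size gives
expected pattern frequencies, which is the kind of quantity a goodness of
fit test compares with the observed frequencies.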




In the Christoffersson (1975) method, the three types of parameters
considered in the previous paper were again considered. However, the
marginal distributions of single items and of pairs of items are used,
which allows for a technique that permits more than 10-12 items to be
analyzed, the limit for the previous study. In addition, a goodness of
fit test was provided.
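
The single-item and item-pair marginals on which this approach rests can
be tabulated directly from pattern frequencies. The sketch below
illustrates only that tabulation, not the generalized least squares
estimation itself; the function name and the small frequency table are
hypothetical.

```python
from itertools import combinations

def marginal_proportions(freqs):
    """Return (single, pairs): proportions passing each item and item pair.

    freqs maps a 0/1 pattern string such as "101" to its observed frequency.
    """
    n_items = len(next(iter(freqs)))
    total = sum(freqs.values())
    single = [0.0] * n_items
    pairs = {pair: 0.0 for pair in combinations(range(n_items), 2)}
    for pattern, count in freqs.items():
        bits = [int(ch) for ch in pattern]
        for i in range(n_items):
            single[i] += bits[i] * count / total
        for i, j in pairs:
            pairs[(i, j)] += bits[i] * bits[j] * count / total
    return single, pairs

# Hypothetical three-item data, for illustration only.
example = {"000": 10, "100": 20, "110": 30, "111": 40}
single, pairs = marginal_proportions(example)
```

For five items this tabulation yields 5 + 10 = 15 marginal proportions in
place of the 32 pattern frequencies.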

In the Krus-Bart (1974) method, the total tested sample of subjects is
divided into disjoint, mutually exclusive classes such that the response
patterns of subjects in any class comply with a linear hierarchy among
the items and that the numbers of response patterns for the classes
decrease in a monotone manner. The numbers of 01 and 10 scores relating
each item and the other items are counted in each class to produce a
non-correlational equivalent to the factor loading relating an item and
the underlying trait. In this method, each class of subjects determines a
factor and its corresponding factor loadings.

Bart and Krus (1973) recommended that hierarchies be built from the
inter-item prerequisite relations. An item i is prerequisite to item j if
and only if the frequency of 01 item response patterns for items i and j
respectively (item i failed, item j passed) is equal to or smaller than
some pre-established tolerance level. If the tolerance level is greater
than 0, then item intransitivities (i.e., item i is prerequisite to item
j and item j is prerequisite to item k, yet item i is not prerequisite to
item k) are possible. However, transitivity is a basic property of a
hierarchy; therefore, a method which allows no item intransitivities is
preferable. The method used in this study to determine bivalent test item
hierarchical structure had two phases: 1) item response patterns whose
frequencies were less than or equal to some minimum and whose combined
frequencies were less than or equal to some set percentage of the sample
size are deleted, as suggested by Airasian and Bart (1973), and
2) inter-item prerequisite relations are derived from the remaining item
response patterns using the method of Bart and Krus (1973) with a
tolerance level of 0. This composite method allows no item
intransitivities and incorporates no infrequently occurring response
patterns.

The four-fold table relating two bivalent items used in the composite
method can also be used to note a relationship between factor structure
and hierarchy structure. If item i is a prerequisite to item j and if the
only zero entry in the table is in the 01 cell for items i and j
respectively, then the phi coefficient will be large to the extent that
the 10 cell entry is small. If item i is logically equivalent to item j
and if the 01 and 10 cell entries are zero, then the phi coefficient will
be 1. Thus, logical equivalence between items implies a unifactor
structure, but a prerequisite relation between items only suggests
strongly a unifactor structure. With respect to other two-item
hierarchical structures, no firm correspondence between factor and
hierarchical structures was identified.

To study the question "Can the factor structure of a test tell us
something about the hierarchical structure?", a set of bivalued items
having a unifactor structure and a set of items having a bifactor
structure were located to determine whether the unifactor items
engendered a test hierarchy with fewer branches and thus more inter-item
prerequisite relations than the test hierarchy for the bifactor items.
Two such groups of test items were used in the Bock and Lieberman study
(1970) and were later reanalyzed by Christoffersson (1975). One group of
test items consisted of five highly homogeneous Figure Classification
items with KR-20 = .880 from the Law School Admission Test (LSAT). From a
conceptual analysis of the items, no inter-item prerequisite relations
were hypothesized. The other group of test items consisted of five
somewhat more heterogeneous Debate items with KR-20 = .765 from the same
LSAT. Again, inter-item prerequisite relations were not evident. All
items were in the five-alternative multiple choice format and were
dichotomously scored. The sample for this data was 1000 subjects drawn
from a larger sample of subjects applying for admission at various
American universities. The data sample was stratified with respect to
university and to level within universities. The raw item response data
are presented in Table 1. Positive features of this data as reported by
Christoffersson (1975) include a unifactor structure for the first set of
items and a bifactor structure for the second set of items. Other
positive features are that the data involved few items, a large subject
sample, and a systematic stratified sampling plan for response pattern
selection. Such features are ideal for ordering theory, partly because
hierarchy determination is very dependent on the number of patterns used
in the analysis. The greater the sample size is in relation to the total
number of possible item response patterns (2^n, with n being the number
of items), the greater the likelihood is that the response patterns
actually involved in the test item hierarchy will have greater
frequencies than the response patterns attributable to chance or error.
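
The relation between the four-fold table and the phi coefficient noted
earlier in this section can be illustrated with a short sketch. The
function names and the three small two-item frequency tables are
hypothetical and serve only to show the arithmetic; the prerequisite test
assumes the 01 cell counts subjects who fail item i and pass item j.

```python
import math

def fourfold(freqs, i, j):
    """Count subjects in each cell of the table relating items i and j.

    Items are 0-based positions; keys are (score on i, score on j).
    """
    cells = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for pattern, count in freqs.items():
        cells[(int(pattern[i]), int(pattern[j]))] += count
    return cells

def phi(cells):
    """Phi coefficient of a four-fold table; None if a margin is empty."""
    n00, n01 = cells[(0, 0)], cells[(0, 1)]
    n10, n11 = cells[(1, 0)], cells[(1, 1)]
    denom = (n10 + n11) * (n00 + n01) * (n01 + n11) * (n00 + n10)
    if denom == 0:
        return None
    return (n11 * n00 - n10 * n01) / math.sqrt(denom)

def is_prerequisite(cells, tolerance=0):
    """Item i is prerequisite to item j if the 01 cell is within tolerance."""
    return cells[(0, 1)] <= tolerance

# Hypothetical two-item frequency tables, for illustration only.
equivalent = {"00": 40, "11": 60}        # 01 and 10 cells both empty
strong = {"00": 40, "10": 5, "11": 55}   # 01 empty, 10 cell small
weak = {"00": 40, "10": 30, "11": 30}    # 01 empty, 10 cell large
```

In all three tables the first item is prerequisite to the second at a
tolerance of 0, yet phi equals 1 only for the logically equivalent pair
and falls as the 10 cell grows, which is why a prerequisite relation
suggests, but does not imply, a single factor.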


RESULTS AND DISCUSSION

Bock and Lieberman (1970) used a chi-square approximation for a
likelihood ratio statistic to show that the Figure Classification items
fitted a unifactor model (χ² = 21.28, d.f. = 21, .40<p<.50), but that the
Debate items did not (χ² ≈ 31, d.f. = 21, .05<p<.10). Christoffersson
(1975), using a similar chi-square approximation, also established that
the Figure Classification items fitted a unifactor model (χ² = 5.02,
d.f. = 5, .40<p<.50), and that the Debate items did not (χ² = 10.30,
d.f. = 5, .05<p<.10). Christoffersson also showed that the Debate items
fitted a bifactor model (χ² = .63, d.f. = 1, .40<p<.45).
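
The probability brackets attached to the chi-square values can be checked
with a short sketch. The function below computes the upper-tail
probability of a chi-square variate for integer degrees of freedom from
the standard finite-series expressions; the function name is
hypothetical, and the three statistics are those quoted above for
Christoffersson's analysis.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^k / k! terms.
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2.0) / k
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: two-sided normal tail plus a finite series.
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2.0 * density * total

# (label, chi-square, d.f., lower bound, upper bound) as quoted in the text.
REPORTED = [
    ("unifactor, Figure Classification", 5.02, 5, 0.40, 0.50),
    ("unifactor, Debate", 10.30, 5, 0.05, 0.10),
    ("bifactor, Debate", 0.63, 1, 0.40, 0.45),
]
for label, stat, df, low, high in REPORTED:
    p = chi2_sf(stat, df)
    print(f"{label}: p = {p:.3f}, inside ({low}, {high}): {low < p < high}")
```

Each of the three computed probabilities falls inside its quoted bracket.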


                               Table 1*

           LSAT: Observed frequencies for response patterns

      Figure Classification                      Debate
 Pattern                            Pattern
 (items                Cumulative   (items                Cumulative
 12345)    Frequency   frequency    12345)    Frequency   frequency
 01010          0           0       00010          1           1
 01100          0           0       00100          3           4
 00100          1           1       00110          3           7
 00101          1           2       01010          3          10
 01000          1           3       01001          5          15
 00010          2           5       11000          6          21
 01110          2           7       00011          7          28
 00000          3          10       01011          7          35
 00110          3          13       01100          7          42
 01101          3          16       10000          7          49
 10100          3          19       11010          7          56
 00111          4          23       01110          8          64
 00001          6          29       01000         10          74
 01001          8          37       10010         11          85
 10000         10          47       00000         12          97
 00011         11          58       10100         14         111
 11100         11          69       10110         15         126
 10010         14          83       00111         17         143
 10110         15          98       11100         18         161
 01111         15         113       00001         19         180
 01011         16         129       00101         19         199
 11000         16         145       01101         23         222
 11010         21         166       11001         25         247
 10101         28         194       01111         28         275
 11110         28         222       11110         32         307
 10001         29         251       10011         34         341
 11001         56         307       11011         35         376
 11101         61         368       10001         39         415
 10111         80         448       10101         51         466
 10011         81         529       10111         90         556
 11011        173         702       11101        136         692
 11111        298        1000       11111        308        1000

*Adapted from data table analyzed by Bock and Lieberman (1970).

The test data were analyzed with the Krus-Bart (1974) method of
multidimensional scaling, in which linear orders among items determined
by sub-samples of subjects are conceived as factors or latent traits for
the items. These results are reported in Table 2.

The Figure Classification items were found to establish one prominent
linear order accounting for 59.4% of the subjects. The Debate items were
found to have one prominent linear order accounting for 56.5% of the
subjects. Second linear orders accounted for 13.4% and 14.2% of the
subjects for the two data sets respectively. The first four linear orders
accounted for 90.5% and 85.3% of the subjects for the two data sets
respectively. Though there are no statistical tests attached to the
Krus-Bart scaling method, the analyses of the two data sets present a
similar picture of one prominent first linear order and markedly less
prominent secondary linear orders. Thus, the Krus-Bart method showed no
clear-cut uniorder or biorder structure for either data set.


                               Table 2

        Latent structures for LSAT items using Krus-Bart method

[Table body largely illegible in the source. Legible subject frequencies
for the Debate factors, in order from the first: 565, 142, 76, 70, 67,
25, 18.]

The data sets were then analyzed with the composite method to determine
item hierarchies through the determination of inter-item prerequisite
relations. Item response patterns with the lowest frequencies whose
combined frequencies were equal to or less than 3% of the sample, or 30
subjects, were deleted and inter-item prerequisite relations were sought;
none were found. The procedure was repeated with the removal of 10% of
the sample, or 100 subjects, which determined cutoff frequencies of 15
and 12 for the two data sets respectively; again no prerequisite
relations were found, i.e., no item was indicated to be prerequisite to
any other item. Only after 129 subjects, or 12.9% of the sample, and 21
distinctly different response patterns, whose frequencies were less than
17, were deleted from analysis, was even one inter-item prerequisite
relation located among the Figure Classification items. Similarly, only
after 126 subjects, or 12.6% of the sample, and 17 distinctly different
response patterns, whose frequencies were less than 15, were deleted from
analysis, was even one inter-item prerequisite relation located among the
Debate items. Both sets of items indicated no hierarchical structure,
thus confirming the hypothesis of no prerequisite relations derived from
conceptual analysis of the items, because no inter-item prerequisite
relations were indicated even when 10% of the most infrequently occurring
response patterns were deleted. Further information on the hierarchy
analysis could be generated, but those results would pale in comparison
to the main result that, in this case, bivalent items which had either a
unifactor or a bifactor structure showed no hierarchical structure. If
one assumes that absence of a factor structure implies absence of a
hierarchical structure, then only from the ordering-theoretic scaling of
the items, which indicated that the two sets of items had no clear-cut
factor structure, could one have expected no clear-cut hierarchy for
either set of test items.
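
The deletion-and-search procedure reported above can be sketched for the
Figure Classification column of Table 1. This is one reading of the
composite method, not the authors' program: it assumes that item i is
prerequisite to item j when no retained pattern with a nonzero frequency
scores 0 on i and 1 on j, that patterns are deleted from the lowest
frequency upward in the order printed in Table 1 (so ties are broken by
table order), and the function names are hypothetical.

```python
# Figure Classification column of Table 1, in the table's ascending order.
FIGURE = [
    ("01010", 0), ("01100", 0), ("00100", 1), ("00101", 1), ("01000", 1),
    ("00010", 2), ("01110", 2), ("00000", 3), ("00110", 3), ("01101", 3),
    ("10100", 3), ("00111", 4), ("00001", 6), ("01001", 8), ("10000", 10),
    ("00011", 11), ("11100", 11), ("10010", 14), ("10110", 15),
    ("01111", 15), ("01011", 16), ("11000", 16), ("11010", 21),
    ("10101", 28), ("11110", 28), ("10001", 29), ("11001", 56),
    ("11101", 61), ("10111", 80), ("10011", 81), ("11011", 173),
    ("11111", 298),
]

def retained_patterns(table, max_deleted):
    """Drop lowest-frequency patterns, in table order, while the number of
    deleted subjects stays at or below max_deleted."""
    deleted, kept = 0, []
    for index, (pattern, count) in enumerate(table):
        if deleted + count > max_deleted:
            kept = table[index:]
            break
        deleted += count
    return deleted, kept

def prerequisite_relations(kept):
    """Pairs (i, j), 1-based, whose 01 cell is empty at tolerance 0."""
    n_items = len(kept[0][0])
    relations = []
    for i in range(n_items):
        for j in range(n_items):
            if i == j:
                continue
            violated = any(c > 0 and p[i] == "0" and p[j] == "1"
                           for p, c in kept)
            if not violated:
                relations.append((i + 1, j + 1))
    return relations

for limit in (30, 100, 129):
    deleted, kept = retained_patterns(FIGURE, limit)
    print(limit, deleted, len(FIGURE) - len(kept),
          prerequisite_relations(kept))
```

Under these assumptions the search returns no relations at the 30- and
100-subject limits and first returns relations once 129 subjects (21
patterns) are deleted, in line with the figures reported for the Figure
Classification items. At that point no retained subject fails item 1, so
the relations found all run from item 1 to the other items.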
SUMMARY

The purpose of this study was to determine whether bivalent items
complying with a unifactor structure produce an item hierarchy with fewer
branches and more inter-item prerequisite relations than bivalent items
complying with a bifactor structure. No such difference was found; in
fact, both sets of items were found to have no hierarchical structures.
Highly homogeneous items testing for one latent trait were found to be
logically independent of each other in terms of indicating no
prerequisite interdependencies. Less homogeneous items fitting a two
latent trait model were also logically independent of each other. Partly
because we would expect many cases of multifactor tests engendering no
hierarchical structures, this study indicates that the factor structure
of a test does not necessarily indicate anything about the hierarchy of a
test.


Certain unresolved issues emanate from this study: 1) is there a
conceptual chasm between latent trait statistical test theory and
ordering theory, between the quest for latent traits and the quest for
item hierarchies? 2) is there some connection that will allow information
from one approach to be converted into information within the other
approach? It would be helpful if response patterns with their frequencies
could be reported in any test results, to allow alternative analyses. But
much test information in education is in the form of completed analysis
results. Thus, it would be helpful to educators if such test results
could be converted into psychometric information relevant from another
perspective. Such efforts to increase the informativeness and
multi-usefulness of test results would inexorably be tied to gains in the
synthesis of psychometric methods.

